AITopics | non-autoregressive machine translation

Non-autoregressive Machine Translation with Probabilistic Context-free Grammar

Gui, Shangtong

Neural Information Processing Systems | Feb-8-2026, 00:44:16 GMT

The vanilla NAT model assumes that the target tokens are independent of each other given the source sentence. This assumption allows all tokens to be generated in parallel.
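
The independence assumption can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names and the toy per-position distributions are hypothetical, and a real model would produce the distributions from the source sentence. Under the assumption, the sequence log-probability is a sum of per-position terms, and each position can be decoded without looking at the others.

```python
import math


def sequence_log_prob(position_dists, tokens):
    """Log-probability of `tokens` when every position is scored independently."""
    if len(position_dists) != len(tokens):
        raise ValueError("one distribution per target position is required")
    return sum(math.log(dist[tok]) for dist, tok in zip(position_dists, tokens))


def parallel_decode(position_dists):
    """Pick the most likely token at each position; no position depends on another."""
    return [max(dist, key=dist.get) for dist in position_dists]


# Hypothetical distributions p(y_t | x) for a 3-token target (illustration only).
position_dists = [
    {"thank": 0.6, "many": 0.4},
    {"you": 0.6, "thanks": 0.4},
    {".": 1.0},
]

decoded = parallel_decode(position_dists)
print(decoded, sequence_log_prob(position_dists, decoded))
```

In this toy example the inconsistent mix "thank thanks ." still receives probability 0.24, because each position is scored without regard to its neighbours.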

artificial intelligence, natural language, translation, (16 more...)

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(8 more...)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Non-autoregressive Machine Translation with Probabilistic Context-free Grammar

Neural Information Processing Systems | Dec-23-2025, 22:58:54 GMT

The Non-autoregressive Transformer (NAT) significantly accelerates inference in neural machine translation. However, conventional NAT models suffer from limited expressive power and degraded performance compared to autoregressive (AT) models because they assume conditional independence among target tokens. To address these limitations, we propose a novel approach called PCFG-NAT, which leverages a specially designed Probabilistic Context-Free Grammar (PCFG) to enhance the ability of NAT models to capture complex dependencies among output tokens. Experimental results on major machine translation benchmarks demonstrate that PCFG-NAT further narrows the gap in translation quality between NAT and AT models. Moreover, PCFG-NAT facilitates a deeper understanding of the generated sentences, addressing the lack of satisfactory explainability in neural machine translation.
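
For readers unfamiliar with PCFGs, the sketch below shows how a generic PCFG assigns a probability to a sentence using the standard inside algorithm. It is only an illustration: the grammar is in Chomsky normal form, the rules and probabilities are made up, and it is not the specially designed grammar used by PCFG-NAT, whose details the abstract does not give.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Total probability that `start` derives `words` under a CNF PCFG.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    """
    n = len(words)
    # chart[(i, j)][A] = probability that nonterminal A derives words[i:j]
    chart = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1)][a] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for (a, b, c), p in binary.items():
                    left = chart[(i, k)].get(b, 0.0)
                    right = chart[(k, j)].get(c, 0.0)
                    if left and right:
                        chart[(i, j)][a] += p * left * right
    return chart[(0, n)].get(start, 0.0)


# Hypothetical toy grammar (illustration only).
binary_rules = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
lexical_rules = {("NP", "she"): 0.5, ("NP", "fish"): 0.5, ("V", "eats"): 1.0}

prob = inside_probability(["she", "eats", "fish"], lexical_rules, binary_rules)
print(prob)
```

The chart makes the dependency structure explicit: the probability of a span depends on how its sub-spans are derived, which is the kind of token interaction a fully factorized model cannot express.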

name change, non-autoregressive machine translation, probabilistic context-free grammar, (4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation

Ma, Zhengrui, Shao, Chenze, Gui, Shangtong, Zhang, Min, Feng, Yang

arXiv.org Artificial Intelligence | Jul-17-2023

Non-autoregressive translation (NAT) reduces decoding latency but suffers from performance degradation due to the multi-modality problem. Recently, the directed acyclic graph structure has achieved great success in NAT; it tackles the multi-modality problem by introducing dependencies between vertices. However, training it with negative log-likelihood loss implicitly requires a strict alignment between reference tokens and vertices, weakening its ability to handle multiple translation modalities. In this paper, we hold the view that all paths in the graph are fuzzily aligned with the reference sentence. We do not require an exact alignment but train the model to maximize a fuzzy alignment score between the graph and the reference, which takes captured translations in all modalities into account. Extensive experiments on major WMT benchmarks show that our method substantially improves translation performance and increases prediction confidence, setting a new state of the art for NAT on the raw training data.
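
The strict alignment implied by negative log-likelihood training can be sketched as a small dynamic program. This illustrates the baseline objective the abstract criticizes, not the paper's fuzzy alignment score. The abstract does not specify the graph, so the following are assumptions: every path starts at the first vertex, ends at the last, visits vertices in increasing index order, and emits exactly one reference token per visited vertex. All names and numbers are hypothetical.

```python
def dag_reference_prob(emit, trans, reference):
    """Probability of `reference`, summed over all paths aligned to it exactly.

    emit[u]:     {token: prob} emitted at vertex u
    trans[v][u]: probability of moving from vertex v to a later vertex u
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    num_vertices, num_tokens = len(emit), len(reference)
    # f[i][u] = probability of emitting reference[:i + 1] on a path ending at u
    f = [[0.0] * num_vertices for _ in range(num_tokens)]
    f[0][0] = emit[0].get(reference[0], 0.0)  # assumed: paths start at vertex 0
    for i in range(1, num_tokens):
        for u in range(1, num_vertices):
            incoming = sum(f[i - 1][v] * trans[v][u] for v in range(u))
            f[i][u] = incoming * emit[u].get(reference[i], 0.0)
    return f[num_tokens - 1][num_vertices - 1]  # assumed: paths end at last vertex


# Hypothetical 3-vertex graph (illustration only).
emit = [{"a": 1.0}, {"b": 0.5, "c": 0.5}, {"c": 0.8, "b": 0.2}]
trans = [[0.0, 0.5, 0.5], [0.0, 0.0, 1.0], [0.0, 0.0, 0.0]]

p_short = dag_reference_prob(emit, trans, ["a", "c"])      # path 0 -> 2
p_long = dag_reference_prob(emit, trans, ["a", "b", "c"])  # path 0 -> 1 -> 2
print(p_short, p_long)
```

Each reference token must land on exactly one vertex of a path, so a path that expresses a different but valid translation contributes nothing to this sum.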

machine learning, natural language, translation, (18 more...)

arXiv:2303.06662

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(16 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Rephrasing the Reference for Non-Autoregressive Machine Translation

Shao, Chenze, Zhang, Jinchao, Zhou, Jie, Feng, Yang

arXiv.org Artificial Intelligence | Nov-30-2022

Non-autoregressive neural machine translation (NAT) models suffer from the multi-modality problem: a source sentence may have multiple possible translations, so the reference sentence may be an inappropriate training target when the NAT output is closer to another translation. In response to this problem, we introduce a rephraser to provide a better training target for NAT by rephrasing the reference sentence according to the NAT output. As we train NAT based on the rephraser output rather than the reference sentence, the rephraser output should fit well with the NAT output and not deviate too far from the reference, which can be quantified as reward functions and optimized by reinforcement learning. Experiments on major WMT benchmarks and NAT baselines show that our approach consistently improves the translation quality of NAT. Specifically, our best variant achieves comparable performance to the autoregressive Transformer, while being 14.7 times more efficient in inference.

artificial intelligence, natural language, rephraser, (17 more...)

arXiv:2211.16863

Country:

North America > Puerto Rico (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

No Language Left Behind: Scaling Human-Centered Machine Translation

NLLB Team, Costa-jussà, Marta R., Cross, James, Çelebi, Onur, Elbayad, Maha, Heafield, Kenneth, Heffernan, Kevin, Kalbassi, Elahe, Lam, Janice, Licht, Daniel, Maillard, Jean, Sun, Anna, Wang, Skyler, Wenzek, Guillaume, Youngblood, Al, Akula, Bapi, Barrault, Loic, Gonzalez, Gabriel Mejia, Hansanti, Prangthip, Hoffman, John, Jarrett, Semarley, Sadagopan, Kaushik Ram, Rowe, Dirk, Spruit, Shannon, Tran, Chau, Andrews, Pierre, Ayan, Necip Fazil, Bhosale, Shruti, Edunov, Sergey, Fan, Angela, Gao, Cynthia, Goswami, Vedanuj, Guzmán, Francisco, Koehn, Philipp, Mourachko, Alexandre, Ropers, Christophe, Saleem, Safiyyah, Schwenk, Holger, Wang, Jeff

arXiv.org Artificial Intelligence | Aug-25-2022

Driven by the goal of eradicating language barriers on a global scale, machine translation has solidified itself as a key focus of artificial intelligence research today. However, such efforts have coalesced around a small subset of languages, leaving behind the vast majority of mostly low-resource languages. What does it take to break the 200-language barrier while ensuring safe, high-quality results, all while keeping ethical considerations in mind? In No Language Left Behind, we took on this challenge by first contextualizing the need for low-resource language translation support through exploratory interviews with native speakers. Then, we created datasets and models aimed at narrowing the performance gap between low- and high-resource languages. More specifically, we developed a conditional compute model based on Sparsely Gated Mixture of Experts that is trained on data obtained with novel and effective data mining techniques tailored for low-resource languages. We propose multiple architectural and training improvements to counteract overfitting while training on thousands of tasks. Critically, we evaluated the performance of over 40,000 different translation directions using a human-translated benchmark, Flores-200, and combined human evaluation with a novel toxicity benchmark covering all languages in Flores-200 to assess translation safety. Our model achieves an improvement of 44% BLEU relative to the previous state-of-the-art, laying important groundwork towards realizing a universal translation system.
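
The idea behind a sparsely gated mixture of experts can be sketched in a few lines of Python. This is a generic top-k gating variant, not the NLLB architecture: the scalar experts, the gate logits, and the choice to renormalize the selected gate probabilities are all assumptions made for illustration. In a real model the gate logits would come from a learned layer applied to the input.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_moe(x, gate_logits, experts, k=2):
    """Route `x` to the k experts with the highest gate probability.

    Only the selected experts are evaluated, which is what makes the
    computation conditional. Returns the mixed output and the chosen indices.
    """
    probs = softmax(gate_logits)
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    output = sum(probs[i] / norm * experts[i](x) for i in chosen)
    return output, chosen


# Hypothetical scalar experts and gate logits (illustration only).
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
gate_logits = [0.0, 2.0, 2.0, -1.0]

output, chosen = top_k_moe(3.0, gate_logits, experts, k=2)
print(output, chosen)
```

With these numbers, experts 1 and 2 are selected with equal weight and the other two are never called, so the cost per input grows with k rather than with the total number of experts.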

massively multilingual machine translation model, massively multilingual neural machine translation, statistically significant human evaluation improvement, (14 more...)

arXiv:2207.04672

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.13)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.13)
Europe > Italy > Tuscany > Florence (0.04)
(49 more...)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Overview (1.00)
(2 more...)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(6 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
